SD-Map - A Fast Algorithm for Exhaustive Subgroup Discovery
Authors
Abstract
In this paper we present the novel SD-Map algorithm for exhaustive but efficient subgroup discovery. SD-Map is guaranteed to identify all interesting subgroup patterns contained in a data set, in contrast to heuristic or sampling-based methods. The SD-Map algorithm adapts the well-known FP-growth method for mining association rules to the subgroup discovery task. We show how SD-Map can handle missing values, and provide an experimental evaluation of the algorithm's performance using synthetic data.
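To make the notion of exhaustive subgroup discovery concrete, the following is a minimal brute-force sketch, not the SD-Map algorithm itself (which builds on FP-growth). It enumerates every conjunction of attribute-value selectors up to a fixed depth and scores each one. The quality function (weighted relative accuracy), the function names `wracc` and `exhaustive_subgroups`, the toy data, and the treatment of missing values (a `None` value simply matches no selector) are all assumptions made for illustration and are not taken from the paper.

```python
from itertools import combinations

def wracc(rows, target, description):
    """Weighted relative accuracy of a subgroup (assumed quality function).

    description maps attribute -> required value; target is a 0/1 column.
    """
    n = len(rows)
    covered = [r for r in rows
               if all(r.get(a) == v for a, v in description.items())]
    if not covered:
        return 0.0
    p0 = sum(r[target] for r in rows) / n            # target share overall
    p = sum(r[target] for r in covered) / len(covered)  # target share in subgroup
    return (len(covered) / n) * (p - p0)

def exhaustive_subgroups(rows, target, max_depth=2):
    """Score every conjunction of up to max_depth selectors (brute force)."""
    # Missing values (None) never form a selector and never match one.
    selectors = sorted({(a, v) for r in rows for a, v in r.items()
                        if a != target and v is not None})
    results = []
    for depth in range(1, max_depth + 1):
        for combo in combinations(selectors, depth):
            # Skip conjunctions that constrain the same attribute twice.
            if len({a for a, _ in combo}) < depth:
                continue
            desc = dict(combo)
            results.append((wracc(rows, target, desc), desc))
    results.sort(key=lambda t: t[0], reverse=True)
    return results

# Hypothetical toy data set with a binary target.
rows = [
    {"smoker": "yes", "age": "old", "target": 1},
    {"smoker": "yes", "age": "young", "target": 1},
    {"smoker": "no", "age": "old", "target": 0},
    {"smoker": "no", "age": "young", "target": 0},
]
best_quality, best_description = exhaustive_subgroups(rows, "target")[0]
```

On this toy data the top-ranked description is `smoker = yes`. The enumeration grows combinatorially with the number of selectors, which is the cost that an efficient exhaustive method has to avoid.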
Related resources
Knowledge-intensive subgroup mining: techniques for automatic and interactive discovery
Data mining has proved its significance in various domains and applications. As an important subfield of the general data mining task, subgroup mining can be used, e.g., for marketing purposes in business domains, or for quality profiling and analysis in medical domains. The goal is to efficiently discover novel, potentially useful and ultimately interesting knowledge. However, in real-world si...
Knowledge Discovery, Data Mining and Machine Learning Editors
Subgroup mining is a flexible data mining method that considers a given target variable and aims to discover interesting subgroups with respect to this property of interest. In this paper, we especially focus on the handling of continuous target variables: We propose novel formalizations of effective pruning strategies for reducing the search space, and we present the SD-Map* algorithm that ena...
Any-time Diverse Subgroup Discovery with Monte Carlo Tree Search
The discovery of patterns that accurately discriminate one class label from another remains a challenging data mining task. Subgroup discovery (SD) is one of the frameworks that makes it possible to elicit such interesting hypotheses from labeled data. One question remains largely open: how to select an accurate heuristic search technique when exhaustive enumeration of the pattern space is infeasible? Exist...
Fast Description-Oriented Community Detection using Subgroup Discovery
Communities can intuitively be defined as subsets of nodes of a graph with a dense structure. However, for mining such communities usually only structural aspects are taken into account. Typically, no concise and easily interpretable community description is provided. For tackling this issue, we focus on fast description-oriented community detection using subgroup discovery, cf. [1, 2]. In orde...
Generic Pattern Trees for Exhaustive Exceptional Model Mining
Exceptional model mining has been proposed as a variant of subgroup discovery especially focusing on complex target concepts. Currently, efficient mining algorithms are limited to heuristic (non exhaustive) methods. In this paper, we propose a novel approach for fast exhaustive exceptional model mining: We introduce the concept of valuation bases as an intermediate condensed data representation...